{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4.2 概率生成模型"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "考虑二分类的问题，类别$C_1$的后验概率可以写成\n",
    "\n",
    "$$\\begin {align}\n",
    "p(C_{1}|x)=\\frac{p( x|C_1)p(C_1)}{p(x|C_1)p(C_1)+p(\\mathbf x|C_2)p(C_2)}&=\\frac{1}{1+exp(-a)}=\\sigma(a)\\end {align}$$\n",
    "\n",
    "其中定义$$a=ln\\frac{p(x|C_1)p(C_1)}{p(x|C_2)p(C_2)}$$\n",
    "$$ \\sigma (a)=\\frac{1}{1+exp(-a)}$$\n",
    "其反函数为$a=ln(\\frac{\\sigma}{1-\\sigma})$,被称为logit函数\n",
    "\n",
    "对于多分类问题\n",
    "$$ p(C_k|x)=\\frac{p(x|C_K)p(C_K)}{\\sum_jp(x|C_j)p(C_j)}=\\frac{exp(a_K)}{\\sum_jexp(a_j)}$$\n",
    "也被称为归一化指数，可以被看作logistic函数的多分类情况的推广，$a_k$被定义为$$a_k=ln(p(x|C_K)p(C_K))$$\n",
    "\n",
    "进一步假设所有类别的协方差矩阵相同，但每个类别的均值不同，服从高斯分布\n",
    "$$p(x|C_k)=\\frac{1}{{2\\pi}^{\\frac{D}{2}}}\\frac{1}{{|\\Sigma|}^{\\frac{1}{2}}}exp\\{-\\frac{1}{2}(x-\\mu_k)^T\\Sigma ^{-1}(x-\\mu_k)\\}$$\n",
    "考虑二分类情形,将条件概率代入$a$的表达式写成与$x$相关的线性表达式:\n",
    "$$p(C_1|x)=\\sigma(w^Tx+w_0)$$\n",
    "其中$$w=\\Sigma^{-1}(\\mu_1-\\mu_2)$$\n",
    "$$w_0=-\\frac{1}{2}\\mu_1^T\\Sigma^{-1}\\mu_1+\\frac{1}{2}\\mu_2^T\\Sigma^{-1}\\mu_2+ln\\frac{p(C_1)}{p(C_2)}$$\n",
    "\n",
    "我们使⽤最⼤似然⽅法\n",
    "调节了⾼斯类条件概率密度，那么我们有2M个参数来描述均值，以及$\\frac{M(M+1)}{2}$个参数来描述\n",
    "（共享的）协⽅差矩阵。算上类先验p(C1)，参数的总数为$\\frac{M(M+5)}{2}+1$\n",
    "\n",
    "\n",
    "所以得到了参数为x的线性函数的logistic sigmoid函数，最终求得的决策边界对应于后验概率p(Ck j x)为常数的决策⾯，因此由x的线性函数给出，从⽽决策边界在输⼊空间是线性的。先验概率密度$p(C_k)$只出现在偏置参数$w_0$中，因此先验的改变的效果是平移决策边界，即平移后验概率中的常数轮廓线。\n",
    "\n",
    "对于k个类别有：$$a_k(x)=w_k^Tx+w_{k0}$$ $$w_k={\\Sigma}^{-1}\\mu_k$$ $$w_{k0}=-\\frac{1}{2}\\mu_k^T{\\Sigma}^{-1}\\mu_k+lnp(C_k)$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4.2.2最大似然"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "假设:$p(C_1)=\\pi$,则$p(C_2)=1-\\pi$,训练数据为${x_n,t_n}$,若$x_n\\in C_1$,则$t_n=1$,否则$t_n=0$。\n",
    "于是有：$$p(x_n,C_1)=\\pi N(x_n|\\mu_1,\\Sigma)$$\n",
    "$$p(x_n,C_2)=(1-\\pi)N(x_n|\\mu_2,\\Sigma)$$\n",
    "那么似然函数为：$$p(t|\\pi,\\mu_1,\\mu_2,\\Sigma)=\\prod_{n=1}^N{[\\pi N(x_n|\\mu_1,\\Sigma)]}^{t_n}{[(1-\\pi)N(x_n|\\mu_2,\\Sigma)]}^{1-t_n}$$\n",
    "\n",
    "结果为：$\\pi=\\frac{N_1}{N_1+N_2}$,$\\mu_1=\\frac{1}{N_1}\\sum_{n=1}^{N}t_nx_n$,$\\mu_2=\\frac{1}{N_2}\\sum_{n=1}^{N}(1-t_n)x_n$\n",
    "\n",
    "最后考虑协方差的最大似然解：\n",
    "$$\\Sigma=S=\\frac{N_1}{N}S_1+{N_2}{N}{S_2}$$\n",
    "$$S_1=\\frac{1}{N_1}\\sum_{n\\in C_1}(x_n-\\mu_1)(x_n-\\mu_1)^T$$\n",
    "$$S_2=\\frac{1}{N_2}\\sum_{n\\in C_2}(x_n-\\mu_2(x_n-\\mu_2)^T$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4.2.3 离散特征"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "对于D"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
